Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add concurrency to the memberlist transport's WriteTo method #525

Merged
merged 5 commits into from
Oct 10, 2024

Conversation

aldernero
Copy link
Contributor

@aldernero aldernero commented May 14, 2024

What this PR does:
This PR adds concurrency to the WriteTo function with a configurable maximum writes parameter to control the number of goroutines.

This allows the memberlist to keep working if one of the addresses it's trying to write to is unreachable

Which issue(s) this PR fixes:

Fixes #192

Checklist

  • Tests updated
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@julienduchesne julienduchesne self-assigned this Oct 2, 2024
@julienduchesne julienduchesne force-pushed the aldernero/nonblocking-memberlist-tcp branch from 3d073ca to 183d576 Compare October 2, 2024 22:58
@julienduchesne julienduchesne force-pushed the aldernero/nonblocking-memberlist-tcp branch 5 times, most recently from 9b48111 to 1807e0c Compare October 3, 2024 13:42
@julienduchesne julienduchesne marked this pull request as ready for review October 3, 2024 14:09
@julienduchesne julienduchesne requested a review from a team October 4, 2024 13:14
Copy link
Member

@pstibrany pstibrany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't make WriteTo non-blocking -- only increases concurrency, but when all write goroutines are used, blocking will still occur. We should clarify that.

kv/memberlist/tcp_transport.go Outdated Show resolved Hide resolved
kv/memberlist/tcp_transport.go Outdated Show resolved Hide resolved
@julienduchesne
Copy link
Member

This doesn't make WriteTo non-blocking -- only increases concurrency, but when all write goroutines are used, blocking will still occur. We should clarify that

I considered giving it a buffered channel which would push back the blocking part. It would still be blocking to a point

An unlimited amount of goroutines could cause unexpected problems

@pstibrany
Copy link
Member

pstibrany commented Oct 4, 2024

I considered giving it a buffered channel which would push back the blocking part. It would still be blocking to a point

An unlimited amount of goroutines could cause unexpected problems

Yes, I think the only safe way to make it non-blocking would be to drop a packet if workers are busy and buffer is full (when used). Having unbounded number of goroutines or unlimited buffer could lead to memory exhaustion.

@julienduchesne julienduchesne changed the title make WriteTo non-blocking Add concurrency to memberlist's WriteTo method Oct 4, 2024
@julienduchesne julienduchesne changed the title Add concurrency to memberlist's WriteTo method Add concurrency to the memberlist transport's WriteTo method Oct 4, 2024
@julienduchesne
Copy link
Member

I've instead renamed the PR and changelog. While, semantically, it doesn't really make it non-blocking, it does resolve the core issue which #192 was meant to solve, would you agree?

@julienduchesne julienduchesne force-pushed the aldernero/nonblocking-memberlist-tcp branch from 55cb80a to b950353 Compare October 8, 2024 12:54
@julienduchesne julienduchesne requested a review from a team October 8, 2024 12:54
julienduchesne added a commit that referenced this pull request Oct 8, 2024
It randomly failed here: #525 and in some of the test runs here: #596

This test fails due to the probe interval being 5s, which is the same time we poll for
kv/memberlist/tcp_transport.go Show resolved Hide resolved
kv/memberlist/tcp_transport.go Outdated Show resolved Hide resolved
kv/memberlist/tcp_transport.go Outdated Show resolved Hide resolved
kv/memberlist/tcp_transport_test.go Outdated Show resolved Hide resolved
julienduchesne added a commit that referenced this pull request Oct 8, 2024
It randomly failed here: #525 and in some of the test runs here: #596

This test fails due to the probe interval being 5s, which is the same time we poll for
@julienduchesne julienduchesne force-pushed the aldernero/nonblocking-memberlist-tcp branch from b950353 to a01997d Compare October 8, 2024 22:13
@julienduchesne
Copy link
Member

/find-flaky-tests

@julienduchesne julienduchesne force-pushed the aldernero/nonblocking-memberlist-tcp branch from a01997d to 9155c11 Compare October 9, 2024 00:49
Copy link
Member

@pstibrany pstibrany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for addressing my feedback.

kv/memberlist/tcp_transport.go Show resolved Hide resolved
@julienduchesne julienduchesne force-pushed the aldernero/nonblocking-memberlist-tcp branch from 3427e65 to 452abc0 Compare October 9, 2024 19:04
- Create goroutines and keep them while the TCPTransport is alive. End them on the `Shutdown` function
- Add `TestTCPTransportWriteToUnreachableAddr` test to check that writing is not blocking anymore (without this PR, it takes `writeCt * timeout` to run and it fails)
- Rename CHANGELOG
- Mutex lock on shutdown rather than write
- Wait when workers are ended rather than for each write
- Move variables around
- Add timeout before dropping requests. This prevents blocking on the `WriteTo` function
@julienduchesne julienduchesne force-pushed the aldernero/nonblocking-memberlist-tcp branch from 452abc0 to ecd4853 Compare October 10, 2024 14:56
@julienduchesne julienduchesne merged commit 879ff5a into main Oct 10, 2024
5 checks passed
@julienduchesne julienduchesne deleted the aldernero/nonblocking-memberlist-tcp branch October 10, 2024 15:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Proposal: make memberlist TCP Transport non blocking
3 participants